Creators/Authors contains: "Ko, Seyoon"

  1. Marschall, Tobias (Ed.)
    Abstract Motivation

    In a genome-wide association study, analyzing multiple correlated traits simultaneously is potentially superior to analyzing the traits one by one. Standard methods for multivariate genome-wide association study operate marker-by-marker and are computationally intensive.

    Results

    We present a sparsity-constrained regression algorithm for multivariate genome-wide association study based on iterative hard thresholding and implement it in a convenient Julia package MendelIHT.jl. In simulation studies with up to 100 quantitative traits, iterative hard thresholding exhibits similar true positive rates, smaller false positive rates, and faster execution times than GEMMA’s linear mixed models and mv-PLINK’s canonical correlation analysis. On UK Biobank data with 470 228 variants, MendelIHT completed a three-trait joint analysis (n=185 656) in 20 h and an 18-trait joint analysis (n=104 264) in 53 h with an 80 GB memory footprint. In short, MendelIHT enables geneticists to fit a single regression model that simultaneously considers the effect of all SNPs and dozens of traits.

    Availability and implementation

    Software, documentation, and scripts to reproduce our results are available from https://github.com/OpenMendel/MendelIHT.jl.

     
  2. Diabetes-related complications reflect longstanding damage to small and large vessels throughout the body. In addition to the duration of diabetes and poor glycemic control, genetic factors are important contributors to the variability in the development of vascular complications. Early heritability studies found strong familial clustering of both macrovascular and microvascular complications. However, they were limited by small sample sizes and large phenotypic heterogeneity, leading to less accurate estimates. We take advantage of two independent studies, UK Biobank and the Action to Control Cardiovascular Risk in Diabetes trial, to survey the single nucleotide polymorphism heritability for diabetes microvascular (diabetic kidney disease and diabetic retinopathy) and macrovascular (cardiovascular events) complications. Heritability for diabetic kidney disease was estimated at 29%. The heritability estimate for microalbuminuria ranged from 24 to 60% and was 41% for macroalbuminuria. Heritability estimates of diabetic retinopathy ranged from 6 to 33%, depending on the phenotype definition. More severe diabetic retinopathy showed a larger genetic contribution. We show, for the first time, that rare variants account for much of the heritability of diabetic retinopathy. This study suggests that a large portion of the genetic risk of diabetes complications is yet to be discovered and emphasizes the need for additional genetic studies of diabetes complications.
  3. Kelso, Janet (Ed.)
    Abstract Motivation

    Current methods for genotype imputation and phasing exploit the volume of data in haplotype reference panels and rely on hidden Markov models (HMMs). Existing programs all have essentially the same imputation accuracy, are computationally intensive and generally require prephasing the typed markers.

    Results

    We introduce a novel data-mining method for genotype imputation and phasing that substitutes highly efficient linear algebra routines for HMM calculations. This strategy, embodied in our Julia program MendelImpute.jl, avoids explicit assumptions about recombination and population structure while delivering similar prediction accuracy, better memory usage and an order of magnitude or better run-times compared to the fastest competing method. MendelImpute operates on both dosage data and unphased genotype data and simultaneously imputes missing genotypes and phase at both the typed and untyped SNPs (single nucleotide polymorphisms). Finally, MendelImpute naturally extends to global and local ancestry estimation and lends itself to new strategies for data compression and hence faster data transport and sharing.

    Availability and implementation

    Software, documentation and scripts to reproduce our results are available from https://github.com/OpenMendel/MendelImpute.jl.

    Supplementary information

    Supplementary data are available at Bioinformatics online.
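
As a rough illustration of the iterative hard thresholding idea named in item 1, the sketch below takes gradient steps on a least-squares loss and keeps only the k largest-magnitude coefficients after each step. It is a single-trait toy in pure Python, not MendelIHT.jl's implementation or API. The function names `hard_threshold` and `iht`, the fixed step size, and the simulated data are all assumptions made for illustration.

```python
import random


def hard_threshold(beta, k):
    """Keep the k largest-magnitude entries of beta and zero out the rest."""
    order = sorted(range(len(beta)), key=lambda j: abs(beta[j]), reverse=True)
    keep = set(order[:k])
    return [b if j in keep else 0.0 for j, b in enumerate(beta)]


def iht(X, y, k, iters=300):
    """Toy iterative hard thresholding for min ||y - X beta||^2 with at most k nonzeros.

    Assumption: a fixed step of 1 / ||X||_F^2, which is conservative because the
    squared Frobenius norm bounds the largest eigenvalue of X^T X from above.
    """
    n, p = len(X), len(X[0])
    step = 1.0 / sum(x * x for row in X for x in row)
    beta = [0.0] * p
    for _ in range(iters):
        # Residual r = y - X beta
        resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
        # Descent direction X^T r (the negative gradient, up to a constant factor)
        direction = [sum(X[i][j] * resid[i] for i in range(n)) for j in range(p)]
        # Gradient step followed by projection onto k-sparse vectors
        beta = hard_threshold([beta[j] + step * direction[j] for j in range(p)], k)
    return beta


# Simulated, noiseless example (hypothetical data): 100 samples, 8 predictors,
# of which only predictors 2 and 5 have nonzero effects.
random.seed(0)
n, p = 100, 8
true_beta = [0.0, 0.0, 3.0, 0.0, 0.0, -3.0, 0.0, 0.0]
X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [sum(X[i][j] * true_beta[j] for j in range(p)) for i in range(n)]

est = iht(X, y, k=2)
support = {j for j, b in enumerate(est) if b != 0.0}
print("estimated support:", sorted(support))
print("estimated coefficients:", [round(b, 4) for b in est])
```

The sparsity level k is fixed here for simplicity; in practice it would have to be chosen by some model-selection procedure, which this sketch does not attempt.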